Convolutional neural networks (CNNs) are not only widespread but have achieved remarkable results in many applications, including image classification, restoration, and generation. Although the weight-sharing property of convolutions has led to their wide adoption across various tasks, their content-agnostic characteristic can also be considered a major drawback. To address this problem, in this paper we propose a novel operation called pixel-adaptive kernel attention (PAKA). PAKA provides directivity to the filter weights by multiplying them with spatially varying attention derived from learnable features. The proposed method infers pixel-adaptive attention maps along the channel and spatial directions separately, addressing the decomposed model with fewer parameters. Our method is trainable end-to-end and applicable to any CNN-based model. In addition, we propose an improved information-aggregation module based on PAKA, called the hierarchical PAKA module (HPM). We demonstrate the superiority of HPM over conventional information-aggregation modules by presenting state-of-the-art performance on semantic segmentation. We validate the proposed method through additional ablation studies and visualize the effect of PAKA, which provides directivity to the weights of convolutions. We also show the generalizability of the method by applying it to multi-modal tasks, in particular color-guided depth-map super-resolution.
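The core idea, a content-adaptive attention map factorized into a channel part and a spatial part, can be illustrated with a minimal numpy sketch. This is a hypothetical simplification, not the authors' implementation: the projections `w_channel` and `w_spatial` and the additive combination are assumptions made for illustration.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def pixel_adaptive_attention(features, w_channel, w_spatial):
    """Modulate a feature map with a decomposed, pixel-adaptive attention map.

    features : (C, H, W) input feature map
    w_channel: (C, C) learnable projection giving per-pixel channel attention
    w_spatial: (C,)   learnable projection giving one spatial attention map

    The full (C, H, W) attention map is factorized into a channel part and a
    spatial part, which needs far fewer parameters than predicting it densely.
    """
    C, H, W = features.shape
    flat = features.reshape(C, H * W)                   # (C, HW)
    channel_att = w_channel @ flat                      # (C, HW): channel attention per pixel
    spatial_att = w_spatial @ flat                      # (HW,):   one scalar per pixel
    att = sigmoid(channel_att + spatial_att[None, :])   # combine and squash to (0, 1)
    return (flat * att).reshape(C, H, W)                # content-adaptive modulation
```

Because the attention values lie strictly in (0, 1), the operation rescales each feature pixel-adaptively rather than replacing it, which is what gives the (shared) convolution weights that follow a content-dependent directivity.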
This paper proposes a novel convolutional layer called perturbed convolution (PConv), which focuses on achieving two goals simultaneously: improving generative adversarial network (GAN) performance and alleviating the memorization problem in which the discriminator memorizes all images in a given dataset as training progresses. In PConv, perturbed features are produced by randomly disturbing the input tensor before performing the convolution operation. This approach is simple but surprisingly effective. First, to produce similar outputs even with the perturbed tensor, each layer of the discriminator should learn robust features with small local Lipschitz values. Second, since the input tensor is randomly perturbed during training, as with dropout in neural networks, the memorization problem can be alleviated. To show the generalization ability of the proposed method, we conducted extensive experiments with various loss functions and datasets including CIFAR-10, CelebA, CelebA-HQ, LSUN, and Tiny-ImageNet. Quantitative evaluations demonstrate that PConv effectively boosts the performance of GANs and conditional GANs in terms of Fréchet inception distance (FID).
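The mechanism, perturb the input tensor at random, then convolve as usual, is simple enough to sketch in a few lines. This is a 1-D toy illustration under assumed names (`perturbed_conv1d`, `noise_scale`), not the paper's layer:

```python
import numpy as np

def perturbed_conv1d(x, kernel, noise_scale=0.1, rng=None):
    """Sketch of a perturbed convolution: randomly disturb the input tensor
    *before* convolving, so downstream layers must learn features that are
    robust (small local Lipschitz values) and cannot simply memorize inputs.

    x: (N,) 1-D signal; kernel: (K,) filter. `noise_scale` is a made-up knob.
    """
    rng = rng or np.random.default_rng()
    perturbed = x + noise_scale * rng.standard_normal(x.shape)  # random perturbation
    return np.convolve(perturbed, kernel, mode="valid")         # ordinary convolution after
```

With `noise_scale=0` the layer reduces to a plain convolution; during training, the nonzero perturbation plays a role analogous to dropout on the input.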
The number of international benchmarking competitions is steadily increasing in various fields of machine learning (ML) research and practice. So far, however, little is known about common practice as well as the bottlenecks faced by the community in tackling the research questions posed. To shed light on the status quo of algorithm development in the specific field of biomedical image analysis, we designed an international survey that was issued to all participants of challenges conducted in conjunction with the IEEE ISBI 2021 and MICCAI 2021 conferences (80 competitions in total). The survey covered participants' expertise and working environments, their chosen strategies, as well as algorithm characteristics. A median of 72% of challenge participants took part in the survey. According to our results, knowledge exchange was the primary incentive (70%) for participation, while the reception of prize money played only a minor role (16%). While a median of 80 working hours was spent on method development, a large portion of participants stated that they did not have enough time for method development (32%). 25% perceived the infrastructure to be a bottleneck. Overall, 94% of all solutions were deep learning-based. Of these, 84% were based on standard architectures. 43% of the respondents reported that the data samples (e.g., images) were too large to be processed at once. This was most commonly addressed by patch-based training (69%), downsampling (37%), and solving 3D analysis tasks as a series of 2D tasks. K-fold cross-validation on the training set was performed by only 37% of the participants, and only 50% of the participants performed ensembling, based on multiple identical models (61%) or heterogeneous models (39%). 48% of the respondents applied postprocessing steps.
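Patch-based training, the most common workaround reported above (69%), splits an image that is too large to process at once into smaller tiles. A minimal sketch follows; the helper name and parameters are illustrative, not from the survey:

```python
import numpy as np

def extract_patches(image, patch, stride):
    """Cut a large 2-D image into square patches for patch-based training.

    image : (H, W) array
    patch : side length of each square patch
    stride: step between patch origins (stride < patch gives overlap)
    Returns an array of shape (n_patches, patch, patch).
    """
    H, W = image.shape
    patches = [
        image[i:i + patch, j:j + patch]
        for i in range(0, H - patch + 1, stride)
        for j in range(0, W - patch + 1, stride)
    ]
    return np.stack(patches)
```

A network is then trained on the stacked patches instead of the full image; at inference time the per-patch predictions are stitched back together.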
In this paper, we investigate strong lottery tickets in generative models: subnetworks that achieve good generative performance without any weight update. Neural network pruning is considered the main cornerstone of model compression for reducing the costs of computation and memory. Unfortunately, pruning a generative model has not been extensively explored, and all existing pruning algorithms suffer from excessive weight-training costs, performance degradation, limited generalizability, or complicated training. To address these problems, we propose to find a strong lottery ticket via moment-matching scores. Our experimental results show that the discovered subnetwork can perform similarly or better than the trained dense model even when only 10% of the weights remain. To the best of our knowledge, we are the first to show the existence of strong lottery tickets in generative models and to provide an algorithm for finding them stably. Our code and supplementary materials are publicly available.
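The defining constraint, the weights are never trained, only masked, can be sketched as a top-k selection over per-weight scores. The scoring function itself (moment matching in the paper) is abstracted away here; `strong_lottery_mask` and `keep_ratio` are illustrative names:

```python
import numpy as np

def strong_lottery_mask(weights, scores, keep_ratio=0.1):
    """Find a 'strong lottery ticket' style subnetwork: keep only the top
    `keep_ratio` fraction of weights ranked by a score and zero the rest.
    The surviving weights keep their random initial values -- no updates.
    """
    k = max(1, int(round(keep_ratio * weights.size)))
    threshold = np.sort(scores.ravel())[-k]            # k-th largest score
    mask = (scores >= threshold).astype(weights.dtype) # binary supermask
    return weights * mask, mask
```

With `keep_ratio=0.1` this mirrors the abstract's setting in which only 10% of the weights remain, and performance comes entirely from which weights are kept, not from training them.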
When developing deep learning models, we usually decide what task we want to solve, then search for a model that generalizes well on it. An intriguing question would be: what if, instead of fixing the task and searching in the model space, we fix the model and search in the task space? Can we find tasks on which the model generalizes? What do they look like, and do they indicate anything? These are the questions we address in this paper. We propose a task discovery framework that automatically finds examples of such tasks by optimizing a generalization-based quantity called the agreement score. We demonstrate that one set of images can give rise to many tasks on which neural networks generalize well. These tasks are a reflection of the inductive biases of the learning framework and the statistical patterns present in the data, and thus they can serve as a useful tool for analysing neural networks and their biases. As an example, we show that the discovered tasks can be used to automatically create adversarial train-test splits which make a model fail at test time, without changing the pixels or labels, but only by selecting how the datapoints should be split between the train and test sets. We end with a discussion on the human-interpretability of the discovered tasks.
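The generalization-based quantity at the heart of the framework can be sketched at its simplest: train two models independently on the same task and measure how often they agree on held-out points. The sketch below only shows the measurement step, with predictions assumed given; names are illustrative:

```python
import numpy as np

def agreement_score(preds_a, preds_b):
    """Agreement score between two independently trained models: the fraction
    of held-out points on which they predict the same label. A task on which
    networks generalize yields high agreement; an arbitrary labelling of the
    same images does not.
    """
    preds_a, preds_b = np.asarray(preds_a), np.asarray(preds_b)
    return float(np.mean(preds_a == preds_b))
```

Task discovery then amounts to searching the task space for labellings that maximize this score.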
This paper analyzes dysarthric speech datasets from three languages with different prosodic systems: English, Korean, and Tamil. We examine 39 acoustic measurements reflecting three speech dimensions: voice quality, pronunciation, and prosody. As a multilingual analysis, the mean values of the acoustic measurements are examined by intelligibility level. In addition, automatic intelligibility classification is performed to review the optimal feature set for each language setting. The analysis suggests that pronunciation features, such as the percentage of correct consonants, the percentage of correct vowels, and the percentage of correct phonemes, are language-independent measurements. Voice quality and prosody features, however, generally present different aspects across languages. Experimental results also indicate that different speech dimensions play a greater role for different languages: prosody for English, pronunciation for Korean, and both prosody and pronunciation for Tamil. This paper contributes to speech pathology in that it distinguishes language-independent from language-dependent measurements for intelligibility classification of English, Korean, and Tamil dysarthric speech.
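The pronunciation measures named above share a simple shape: the proportion of produced phonemes that match the target transcription. A minimal sketch follows; real scoring first aligns the two sequences, which this position-by-position version deliberately omits, and the function name is hypothetical:

```python
def percent_correct(produced, target):
    """Percentage-of-correct-phonemes style measure: the share of positions at
    which the produced phoneme matches the target phoneme (simplified; real
    clinical scoring aligns the sequences before comparing).
    """
    matches = sum(p == t for p, t in zip(produced, target))
    return 100.0 * matches / max(len(target), 1)
```

Restricting `produced` and `target` to consonant or vowel positions yields the percentage of correct consonants or vowels, respectively.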
This paper proposes a cross-lingual classification method for English, Korean, and Tamil, which employs both language-independent features and language-unique features. First, we extract 39 features from diverse speech dimensions such as voice quality, pronunciation, and prosody. Second, feature selection is applied to determine the optimal feature set for each language. A set of shared features and a set of distinctive features are identified by comparing the feature-selection results across the three languages. Finally, automatic severity classification is performed using the two feature sets. Notably, the proposed method excludes different features for each language to prevent the negative effect of features unique to other languages. Accordingly, the extreme gradient boosting (XGBoost) algorithm is employed for classification, owing to its strength in handling missing data. To validate the effectiveness of our proposed method, two baseline experiments were conducted: one using the intersection of the monolingual feature sets (Intersection) and one using their union (Union). According to the experimental results, our method achieves better performance with an F1 score of 67.14%, compared with 64.52% for the Intersection experiment and 66.74% for the Union experiment. Furthermore, the proposed method attains better performance than monolingual classification for all three languages, achieving relative percentage increases of 17.67%, 2.28%, and 7.79% for English, Korean, and Tamil, respectively. The results indicate that commonly shared features and language-specific features must be considered separately for cross-lingual dysarthria severity classification.
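The feature-set construction, shared features from the intersection of the per-language selections, unique features from what remains, can be sketched with plain set operations. The helper name and the example feature names are hypothetical:

```python
def split_features(selected):
    """Split per-language selected features into one shared set (chosen for
    every language) and per-language unique sets, mirroring the Intersection
    and Union baselines. `selected` maps language -> set of feature names.
    """
    shared = set.intersection(*selected.values())              # Intersection baseline
    union = set.union(*selected.values())                      # Union baseline
    unique = {lang: feats - shared for lang, feats in selected.items()}
    return shared, union, unique
```

In the proposed method, a language's samples simply carry missing values for other languages' unique features, which is why a missing-data-tolerant learner such as XGBoost is a natural fit.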
In this paper, we propose to leverage a unique characteristic of dialogues, the commonsense knowledge shared among participants, to resolve the difficulties of summarizing them. We present SICK, a framework that uses commonsense inferences as additional context. Compared with previous work that relies solely on the input dialogue, SICK uses an external knowledge model to generate a rich set of commonsense inferences and selects the most probable one with a similarity-based selection method. Built upon SICK, SICK++ utilizes commonsense as supervision, adding the generation of commonsense inferences as an auxiliary task in a multitask learning setting for dialogue summarization. Experimental results show that, by injecting commonsense knowledge, our framework generates more informative and consistent summaries than existing methods.
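The similarity-based selection step can be sketched as a cosine-similarity argmax over candidate inference embeddings. This assumes the embeddings come from some external encoder; the function name and interface are illustrative, not the paper's API:

```python
import numpy as np

def select_inference(context_vec, inference_vecs):
    """Similarity-based selection: pick the commonsense inference whose
    embedding is most cosine-similar to the dialogue-context embedding.

    context_vec   : (D,)   embedding of the dialogue context
    inference_vecs: (N, D) embeddings of N candidate inferences
    Returns (index of best candidate, its cosine similarity).
    """
    c = context_vec / np.linalg.norm(context_vec)
    M = inference_vecs / np.linalg.norm(inference_vecs, axis=1, keepdims=True)
    sims = M @ c                                  # cosine similarity per candidate
    return int(np.argmax(sims)), float(np.max(sims))
```

The selected inference is then concatenated to the dialogue as additional context before summarization.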
Biofilms pose significant problems for engineers in diverse fields such as marine science, bioenergy, and biomedicine, where effective biofilm control is a long-term goal. The adhesion and surface mechanics of biofilms play crucial roles in their generation and removal. Designing customized nano-surfaces with different surface topologies can alter the adhesive properties so that biofilms are removed more easily, greatly improving long-term biofilm control. To rapidly design such topologies, we employ individual-based modeling and Bayesian optimization to automate the design process and generate diverse active surfaces for effective biofilm removal. Our framework successfully generates ideal nano-surfaces for biofilm removal through applied shear and vibration. A densely distributed short-pillar topography is the optimal geometry for preventing biofilm formation. Under liquid shear, the optimal topography consists of sparsely distributed tall, slender pillar-like structures. When subjected to vertical or lateral vibration, thick trapezoidal cones are found to be optimal. Optimizing the vibration load shows that vibration amplitudes at relatively low frequencies are more effective at removing biofilms. Our results provide insights for the various engineering fields that require surface-mediated biofilm control. Our framework can also be applied to more general material design and optimization.
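The design loop described above, evaluate candidate surface parameters with a simulator, fit a surrogate, pick the next design, is a standard Bayesian-optimization pattern. Below is a toy 1-D stand-in with a tiny Gaussian-process surrogate and an upper-confidence-bound rule; the objective, kernel length-scale, and all names are assumptions for illustration, not the paper's setup:

```python
import numpy as np

def rbf(a, b, ls=0.3):
    """Squared-exponential kernel between 1-D point sets a (n,) and b (m,)."""
    return np.exp(-0.5 * ((a[:, None] - b[None, :]) / ls) ** 2)

def bayes_opt(f, bounds=(0.0, 1.0), n_init=3, n_iter=15, seed=0):
    """Minimal 1-D Bayesian optimization: fit a GP to the designs evaluated so
    far, then choose the next design by maximizing an upper confidence bound.
    `f` plays the role of the (expensive) biofilm-removal simulation.
    """
    rng = np.random.default_rng(seed)
    lo, hi = bounds
    X = rng.uniform(lo, hi, n_init)                   # initial random designs
    y = np.array([f(x) for x in X])
    grid = np.linspace(lo, hi, 200)                   # candidate designs
    for _ in range(n_iter):
        K = rbf(X, X) + 1e-6 * np.eye(len(X))         # jitter for stability
        Ks = rbf(grid, X)
        mu = Ks @ np.linalg.solve(K, y)               # GP posterior mean
        var = 1.0 - np.sum(Ks * np.linalg.solve(K, Ks.T).T, axis=1)
        ucb = mu + 2.0 * np.sqrt(np.clip(var, 0.0, None))  # explore + exploit
        x_next = grid[np.argmax(ucb)]
        X = np.append(X, x_next)
        y = np.append(y, f(x_next))
    best = int(np.argmax(y))
    return X[best], y[best]
```

In the paper's setting the scalar `x` would be replaced by surface-topography parameters (pillar height, spacing, cone angle) and `f` by the individual-based biofilm simulation.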
Image restoration is an important and challenging task in computer vision. Restoring a filtered image to its original helps various computer vision tasks. We adopt NAFNet (Nonlinear Activation Free Network) as a fast and lightweight baseline and add a color-attention module that extracts useful color information for higher accuracy. We propose an accurate, fast, and lightweight network with multi-scale and color attention for Instagram filter removal (CAIR). Experimental results show that the proposed CAIR outperforms existing Instagram filter removal networks in a fast and lightweight way, being about 11$\times$ faster and 2.4$\times$ lighter while exceeding them by 3.69 dB in PSNR on the IFFI dataset. CAIR can successfully remove Instagram filters with high quality and restore color information, as shown in qualitative results. The source code and pretrained weights are available at \url{https://github.com/hnv-lab/cair}.
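A channel-attention module of the squeeze-and-excite kind, pool each channel to a scalar, gate it, and reweight the channels, conveys the flavor of attending to color statistics. This toy numpy version is an assumption-laden sketch, not CAIR's actual color-attention module:

```python
import numpy as np

def color_attention(features):
    """Toy channel ('color') attention: squeeze each channel by global average
    pooling, squash the result to (0, 1), and reweight the channels, so
    channels carrying stronger statistics are emphasized.

    features: (C, H, W) feature map; returns the reweighted map.
    """
    pooled = features.mean(axis=(1, 2))            # squeeze: one scalar per channel
    weights = 1.0 / (1.0 + np.exp(-pooled))        # sigmoid gate
    return features * weights[:, None, None]       # excite: channel-wise reweighting
```

In a real restoration network the gate would be a small learned MLP rather than a bare sigmoid, but the squeeze-gate-reweight structure is the same.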